Search | BVS Regional Portal

1.

Standardised Versioning of Datasets: a FAIR-compliant Proposal.

González-Cebrián, Alba; Bradford, Michael; Chis, Adriana E; González-Vélez, Horacio.

Sci Data ; 11(1): 358, 2024 Apr 09.

Article in English | MEDLINE | ID: mdl-38594314

ABSTRACT

This paper presents a standardised dataset versioning framework for improved reusability, recognition and data version tracking, facilitating comparisons and informed decision-making for data usability and workflow integration. The framework adopts a software engineering-like data versioning nomenclature ("major.minor.patch") and incorporates data schema principles to promote reproducibility and collaboration. To quantify changes in statistical properties over time, the concept of data drift metrics ($d$) is introduced. Three metrics ($d_P$, $d_{E,\mathrm{PCA}}$, and $d_{E,\mathrm{AE}}$) based on unsupervised Machine Learning techniques (Principal Component Analysis and Autoencoders) are evaluated for dataset creation, update, and deletion. The optimal choice is the $d_{E,\mathrm{PCA}}$ metric, combining PCA models with splines. It exhibits efficient computational time, with values below 50 for new dataset batches and values consistent with seasonal or trend variations. Major updates (i.e., values of 100) occur when scaling transformations are applied to over 30% of variables while efficiently handling information loss, yielding values close to 0. This metric achieved a favourable trade-off between interpretability, robustness against information loss, and computation time.
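As a rough illustration of how a drift value could drive a "major.minor.patch" bump, the sketch below assumes a drift score on a 0 to 100 scale. The abstract reports only that new batches score below 50 and that major updates reach 100, so the function name `bump_version` and both cut-offs (`minor_threshold`, `major_threshold`) are hypothetical choices, not the authors' rules.

```python
def bump_version(version: str, drift: float,
                 minor_threshold: float = 1.0,
                 major_threshold: float = 100.0) -> str:
    """Return the next 'major.minor.patch' string for a given drift score.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if drift >= major_threshold:
        # Large drift: treat as a breaking change and reset minor and patch.
        return f"{major + 1}.0.0"
    if drift >= minor_threshold:
        # Moderate drift: a compatible update such as a new data batch.
        return f"{major}.{minor + 1}.0"
    # Negligible drift: a small correction.
    return f"{major}.{minor}.{patch + 1}"
```

Under these assumed cut-offs, `bump_version("1.2.3", 30.0)` yields a minor bump, which matches the abstract's description of new batches scoring below 50.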

Subjects

Datasets as Topic , Software , Principal Component Analysis , Reproducibility of Results , Workflow , Datasets as Topic/standards , Machine Learning

2.

MSLTE: multiple self-supervised learning tasks for enhancing EEG emotion recognition.

Li, Guangqiang; Chen, Ning; Niu, Yixiang; Xu, Zhangyong; Dong, Yuxuan; Jin, Jing; Zhu, Hongqin.

J Neural Eng ; 21(2), 2024 Apr 17.

Article in English | MEDLINE | ID: mdl-38588700

ABSTRACT

Objective. The instability of the EEG acquisition devices may lead to information loss in the channels or frequency bands of the collected EEG. This phenomenon may be ignored in available models, which leads to the overfitting and low generalization of the model. Approach. Multiple self-supervised learning tasks are introduced in the proposed model to enhance the generalization of EEG emotion recognition and reduce the overfitting problem to some extent. Firstly, channel masking and frequency masking are introduced to simulate the information loss in certain channels and frequency bands resulting from the instability of EEG, and two self-supervised learning-based feature reconstruction tasks combining masked graph autoencoders (GAE) are constructed to enhance the generalization of the shared encoder. Secondly, to take full advantage of the complementary information contained in these two self-supervised learning tasks to ensure the reliability of feature reconstruction, a weight sharing (WS) mechanism is introduced between the two graph decoders. Thirdly, an adaptive weight multi-task loss (AWML) strategy based on homoscedastic uncertainty is adopted to combine the supervised learning loss and the two self-supervised learning losses to enhance the performance further. Main results. Experimental results on SEED, SEED-V, and DEAP datasets demonstrate that: (i) Generally, the proposed model achieves higher averaged emotion classification accuracy than various baselines included in both subject-dependent and subject-independent scenarios. (ii) Each key module contributes to the performance enhancement of the proposed model. (iii) It achieves higher training efficiency, and significantly lower model size and computational complexity than the state-of-the-art (SOTA) multi-task-based model. (iv) The performances of the proposed model are less influenced by the key parameters. Significance. The introduction of the self-supervised learning task helps to enhance the generalization of the EEG emotion recognition model and eliminate overfitting to some extent, which can be modified to be applied in other EEG-based classification tasks.
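The adaptive weighting of one supervised and two self-supervised losses can be sketched as follows. This uses one common simplified form of homoscedastic-uncertainty weighting, in which each task loss is scaled by exp(-s) and the learnable log-variance s is added as a regulariser; the abstract does not give the authors' exact formula, so the function name and this form are assumptions.

```python
import math

def uncertainty_weighted_loss(losses, log_vars):
    """Combine task losses using per-task log-variances.

    Assumed simplified form: total = sum(exp(-s_i) * L_i + s_i).
    In training, the log-variances would be learnable parameters.
    """
    if len(losses) != len(log_vars):
        raise ValueError("need one log-variance per task loss")
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))
```

A task with a large log-variance contributes less to the total, while the added `s` term discourages the model from inflating every variance to ignore all tasks.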

Subjects

Electroencephalography , Emotions , Supervised Machine Learning , Supervised Machine Learning/standards , Datasets as Topic , Humans

3.

Revealing uncertainty in the status of biodiversity change.

Johnson, T F; Beckerman, A P; Childs, D Z; Webb, T J; Evans, K L; Griffiths, C A; Capdevila, P; Clements, C F; Besson, M; Gregory, R D; Thomas, G H; Delmas, E; Freckleton, R P.

Nature ; 628(8009): 788-794, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38538788

ABSTRACT

Biodiversity faces unprecedented threats from rapid global change [1]. Signals of biodiversity change come from time-series abundance datasets for thousands of species over large geographic and temporal scales. Analyses of these biodiversity datasets have pointed to varied trends in abundance, including increases and decreases. However, these analyses have not fully accounted for spatial, temporal and phylogenetic structures in the data. Here, using a new statistical framework, we show across ten high-profile biodiversity datasets [2-11] that increases and decreases under existing approaches vanish once spatial, temporal and phylogenetic structures are accounted for. This is a consequence of existing approaches severely underestimating trend uncertainty and sometimes misestimating the trend direction. Under our revised average abundance trends that appropriately recognize uncertainty, we failed to observe a single increasing or decreasing trend at 95% credible intervals in our ten datasets. This emphasizes how little is known about biodiversity change across vast spatial and taxonomic scales. Despite this uncertainty at vast scales, we reveal improved local-scale prediction accuracy by accounting for spatial, temporal and phylogenetic structures. Improved prediction offers hope of estimating biodiversity change at policy-relevant scales, guiding adaptive conservation responses.

Subjects

Biodiversity , Phylogeny , Uncertainty , Animals , Conservation of Natural Resources/trends , Datasets as Topic , Time Factors , Spatio-Temporal Analysis

4.

A dedicated structured data set for reporting of invasive carcinoma of the breast in the setting of neoadjuvant therapy: recommendations from the International Collaboration on Cancer Reporting (ICCR).

Bossuyt, Veerle; Provenzano, Elena; Symmans, W Fraser; Webster, Fleur; Allison, Kimberly H; Dang, Chau; Gobbi, Helenice; Kulka, Janina; Lakhani, Sunil R; Moriya, Takuya; Quinn, Cecily M; Sapino, Anna; Schnitt, Stuart; Sibbering, D Mark; Slodkowska, Elzbieta; Yang, Wentao; Tan, Puay Hoon; Ellis, Ian.

Histopathology ; 84(7): 1111-1129, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38443320

ABSTRACT

AIMS: The International Collaboration on Cancer Reporting (ICCR), a global alliance of major (inter-)national pathology and cancer organisations, is an initiative aimed at providing a unified international approach to reporting cancer. ICCR recently published new data sets for the reporting of invasive breast carcinoma, surgically removed lymph nodes for breast tumours and ductal carcinoma in situ, variants of lobular carcinoma in situ and low-grade lesions. The data set in this paper addresses the neoadjuvant setting. The aim is to promote high-quality, standardised reporting of tumour response and residual disease after neoadjuvant treatment that can be used for subsequent management decisions for each patient. METHODS: The ICCR convened expert panels of breast pathologists with a representative surgeon and oncologist to critically review and discuss current evidence. Feedback from the international public consultation was critical in the development of this data set. RESULTS: The expert panel concluded that a dedicated data set was required for reporting of breast specimens post-neoadjuvant therapy with inclusion of data elements specific to the neoadjuvant setting as core or non-core elements. This data set proposes a practical approach for handling and reporting breast resection specimens following neoadjuvant therapy. The comments for each data element clarify terminology, discuss available evidence and highlight areas with limited evidence that need further study. This data set overlaps with, and should be used in conjunction with, the data sets for the reporting of invasive breast carcinoma and surgically removed lymph nodes from patients with breast tumours, as appropriate. Key issues specific to the neoadjuvant setting are included in this paper. The entire data set is freely available on the ICCR website. CONCLUSIONS: High-quality, standardised reporting of tumour response and residual disease after neoadjuvant treatment is critical for subsequent management decisions for each patient.

Subjects

Breast Neoplasms , Neoadjuvant Therapy , Humans , Breast Neoplasms/pathology , Breast Neoplasms/therapy , Female , Datasets as Topic

5.

Disappearing cities on US coasts.

Ohenhen, Leonard O; Shirzaei, Manoochehr; Ojha, Chandrakanta; Sherpa, Sonam F; Nicholls, Robert J.

Nature ; 627(8002): 108-115, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38448695

ABSTRACT

The sea level along the US coastlines is projected to rise by 0.25-0.3 m by 2050, increasing the probability of more destructive flooding and inundation in major cities [1-3]. However, these impacts may be exacerbated by coastal subsidence (the sinking of coastal land areas [4]), a factor that is often underrepresented in coastal-management policies and long-term urban planning [2,5]. In this study, we combine high-resolution vertical land motion (that is, raising or lowering of land) and elevation datasets with projections of sea-level rise to quantify the potential inundated areas in 32 major US coastal cities. Here we show that, even when considering the current coastal-defence structures, further land area of between 1,006 and 1,389 km² is threatened by relative sea-level rise by 2050, posing a threat to a population of 55,000-273,000 people and 31,000-171,000 properties. Our analysis shows that not accounting for spatially variable land subsidence within the cities may lead to inaccurate projections of expected exposure. These potential consequences show the scale of the adaptation challenge, which is not appreciated in most US coastal cities.
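The idea of combining elevation, vertical land motion and projected sea-level rise can be sketched with a simple "bathtub" calculation over grid cells. This is a minimal illustration under stated assumptions: the function name, the per-cell inputs and the rule "a cell is inundated when its projected elevation is at or below the projected sea level" are hypothetical, and the sketch ignores coastal defences and hydrological connectivity, which the study does consider.

```python
def inundated_area_km2(elevations_m, vlm_rates_m_per_yr, slr_m, years, cell_area_km2):
    """Estimate inundated area from per-cell elevation and land motion.

    vlm_rates_m_per_yr: vertical land motion per cell (negative = subsidence).
    slr_m: projected sea-level rise over the same period, in metres.
    """
    flooded_cells = 0
    for elevation, rate in zip(elevations_m, vlm_rates_m_per_yr):
        # Project each cell's elevation forward using its own motion rate.
        future_elevation = elevation + rate * years
        if future_elevation <= slr_m:
            flooded_cells += 1
    return flooded_cells * cell_area_km2
```

Running the same cells with all rates set to zero shows how ignoring spatially variable subsidence can change the estimate, which is the point the abstract makes about inaccurate exposure projections.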

Subjects

Altitude , Cities , City Planning , Floods , Motion , Sea Level Rise , Cities/statistics & numerical data , City Planning/methods , City Planning/trends , Floods/prevention & control , Floods/statistics & numerical data , United States , Datasets as Topic , Sea Level Rise/statistics & numerical data , Acclimatization

6.

Integrative cross-species analysis reveals conserved and unique signatures in fatty skeletal muscles.

Wang, Liyi; Zhou, Yanbing; Wang, Yizhen; Shan, Tizhong.

Sci Data ; 11(1): 290, 2024 Mar 12.

Article in English | MEDLINE | ID: mdl-38472209

ABSTRACT

Fat infiltration in skeletal muscle is now recognized as a standard feature of aging and is directly related to the decline in muscle function. However, there is still limited systematic integration and exploration of the mechanisms underlying the occurrence of myosteatosis in aging across species. Here, we re-analyzed bulk RNA-seq datasets to investigate the association between fat infiltration in skeletal muscle and aging. Our integrated analysis of single-nucleus transcriptomics in aged humans and Laiwu pigs with high intramuscular fat content identified species-preference subclusters and revealed core gene programs associated with myosteatosis. Furthermore, we found that fibro/adipogenic progenitors (FAPs) had the potential capacity to differentiate into PDE4D+/PDE7B+ preadipocytes across species. Additionally, cell-cell communication analysis revealed that FAPs may be associated with other adipogenic potential clusters via the COL4A2 and COL6A3 pathways. Our study elucidates the correlation mechanism between aging and fat infiltration in skeletal muscle, and these consensus signatures in both humans and pigs may contribute to increasing reproducibility and reliability in future studies in the field of muscle research.

Subjects

Adipogenesis , Aging , Skeletal Muscle , Aged , Animals , Humans , Adipogenesis/physiology , Cell Differentiation , Skeletal Muscle/physiology , Swine , Datasets as Topic , RNA-Seq , Transcriptome , Adipocytes , Stem Cells

7.

Single-cell integrative analysis reveals consensus cancer cell states and clinical relevance in breast cancer.

Pang, Lin; Xiang, Fengyu; Yang, Huan; Shen, Xinyue; Fang, Ming; Li, Ran; Long, Yongjin; Li, Jiali; Yu, Yonghuan; Pang, Bo.

Sci Data ; 11(1): 289, 2024 Mar 12.

Article in English | MEDLINE | ID: mdl-38472225

ABSTRACT

High heterogeneity and complex interactions of malignant cells in breast cancer have been recognized as a driver of cancer progression and therapeutic failure. However, complete understanding of common cancer cell states and their underlying driver factors remains scarce and challenging. Here, we revealed seven consensus cancer cell states recurring across patients by integrative analysis of single-cell RNA sequencing data of breast cancer. The distinct biological functions, the subtype-specific distribution, the potential cells of origin and the interrelation of consensus cancer cell states were systematically elucidated and validated in multiple independent datasets. We further uncovered the internal regulons and external cell components in tumor microenvironments, which contribute to the consensus cancer cell states. Using the state-specific signature, we also inferred the abundance of cells with each consensus cancer cell state by deconvolution of large breast cancer RNA-seq cohorts, revealing the association of an immune-related state with better survival. Our study provides new insights into the cancer cell state composition and potential therapeutic strategies of breast cancer.

Subjects

Breast Neoplasms , Single-Cell Analysis , Female , Humans , Breast Neoplasms/diagnosis , Breast Neoplasms/genetics , Clinical Relevance , Tumor Microenvironment , Datasets as Topic , RNA Sequence Analysis

8.

On the ethics of informed consent in genetic data collected before 1997.

Zieger, Martin; Joly, Yann; D'Amato, Maria Eugenia.

Nature ; 627(8003): 271, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38472325

Subjects

Data Collection , Human Genetics , Informed Consent , Humans , Datasets as Topic/ethics , Datasets as Topic/history , 20th Century History , Human Genetics/ethics , Human Genetics/history , Informed Consent/ethics , Informed Consent/history , Data Collection/ethics , Data Collection/history

9.

Genomic data in the All of Us Research Program.

Nature ; 627(8003): 340-346, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38374255

ABSTRACT

Comprehensively mapping the genetic basis of human disease across diverse individuals is a long-standing goal for the field of human genetics [1-4]. The All of Us Research Program is a longitudinal cohort study aiming to enrol a diverse group of at least one million individuals across the USA to accelerate biomedical research and improve human health [5,6]. Here we describe the programme's genomics data release of 245,388 clinical-grade genome sequences. This resource is unique in its diversity as 77% of participants are from communities that are historically under-represented in biomedical research and 46% are individuals from under-represented racial and ethnic minorities. All of Us identified more than 1 billion genetic variants, including more than 275 million previously unreported genetic variants, more than 3.9 million of which had coding consequences. Leveraging linkage between genomic data and the longitudinal electronic health record, we evaluated 3,724 genetic variants associated with 117 diseases and found high replication rates across both participants of European ancestry and participants of African ancestry. Summary-level data are publicly available, and individual-level data can be accessed by researchers through the All of Us Researcher Workbench using a unique data passport model with a median time from initial researcher registration to data access of 29 hours. We anticipate that this diverse dataset will advance the promise of genomic medicine for all.

Subjects

Datasets as Topic , Medical Genetics , Population Genetics , Human Genome , Genomics , Minority Groups , Racial Groups , Humans , Access to Information , Black People/genetics , Electronic Health Records , Ethnicity/genetics , European People/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Human Genome/genetics , Longitudinal Studies , Racial Groups/genetics , Reproducibility of Results , Research Personnel , Time Factors , Vulnerable Populations

10.

Mutualisms weaken the latitudinal diversity gradient among oceanic islands.

Delavaux, Camille S; Crowther, Thomas W; Bever, James D; Weigelt, Patrick; Gora, Evan M.

Nature ; 627(8003): 335-339, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38418873

ABSTRACT

The latitudinal diversity gradient (LDG) dominates global patterns of diversity [1,2], but the factors that underlie the LDG remain elusive. Here we use a unique global dataset [3] to show that vascular plants on oceanic islands exhibit a weakened LDG and explore potential mechanisms for this effect. Our results show that traditional physical drivers of island biogeography [4], namely area and isolation, contribute to the difference between island and mainland diversity at a given latitude (that is, the island species deficit), as smaller and more distant islands experience reduced colonization. However, plant species with mutualists are underrepresented on islands, and we find that this plant mutualism filter explains more variation in the island species deficit than abiotic factors. In particular, plant species that require animal pollinators or microbial mutualists such as arbuscular mycorrhizal fungi contribute disproportionately to the island species deficit near the Equator, with contributions decreasing with distance from the Equator. Plant mutualist filters on species richness are particularly strong at low absolute latitudes where mainland richness is highest, weakening the LDG of oceanic islands. These results provide empirical evidence that mutualisms, habitat heterogeneity and dispersal are key to the maintenance of high tropical plant diversity and mediate the biogeographic patterns of plant diversity on Earth.

Subjects

Biodiversity , Geographic Mapping , Islands , Plants , Symbiosis , Animals , Datasets as Topic , Mycorrhizae/physiology , Plants/microbiology , Pollination , Tropical Climate , Oceans and Seas , Phylogeography

11.

The effect of data resampling methods in radiomics.

Demircioglu, Aydin.

Sci Rep ; 14(1): 2858, 2024 Feb 03.

Article in English | MEDLINE | ID: mdl-38310165

ABSTRACT

Radiomic datasets can be class-imbalanced, for instance, when the prevalence of diseases varies notably, meaning that the number of positive samples is much smaller than that of negative samples. In these cases, the majority class may dominate the model's training and thus negatively affect the model's predictive performance, leading to bias. Therefore, resampling methods are often utilized to class-balance the data. However, several resampling methods exist, and neither their relative predictive performance nor their impact on feature selection has been systematically analyzed. In this study, we aimed to measure the impact of nine resampling methods on radiomic models utilizing a set of fifteen publicly available datasets regarding their predictive performance. Furthermore, we evaluated the agreement and similarity of the set of selected features. Our results show that applying resampling methods did not improve the predictive performance on average. On specific datasets, slight improvements in predictive performance (+0.015 in AUC) could be seen. A considerable disagreement on the set of selected features was seen (only 28.7% of features agreed), which strongly impedes feature interpretability. However, selected features are similar when considering their correlation (82.9% of features correlated on average).
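To make "resampling to class-balance the data" concrete, the sketch below shows random oversampling, one of the simplest resampling strategies. The abstract does not list which nine methods were compared, so this is an illustrative example rather than the study's procedure; the function name and the seeded generator are assumptions.

```python
import random
from collections import Counter

def random_oversample(features, labels, seed=0):
    """Duplicate minority-class samples at random until all classes match
    the size of the largest class. Returns new feature and label lists."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_features, out_labels = list(features), list(labels)
    for cls, count in counts.items():
        # Indices of the original samples belonging to this class.
        indices = [i for i, label in enumerate(labels) if label == cls]
        for _ in range(target - count):
            picked = rng.choice(indices)
            out_features.append(features[picked])
            out_labels.append(labels[picked])
    return out_features, out_labels
```

In a cross-validated pipeline, a step like this would normally be applied to the training portion only, so that duplicated samples cannot leak into the evaluation data.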

Subjects

Data Analysis , 60570 , Datasets as Topic

12.

Supplemental Nutrition Assistance Program and Adherence to Antihypertensive Medications.

Islam, Md Mohaimenul; Oyarzun-Gonzalez, Ximena; Bose-Brill, Seuli; Donneyong, Macarius M.

JAMA Netw Open ; 7(2): e2356619, 2024 Feb 05.

Article in English | MEDLINE | ID: mdl-38393731

ABSTRACT

Importance: Nonadherence to antihypertensive medications is associated with uncontrolled blood pressure, higher mortality rates, and increased health care costs, and food insecurity is one of the modifiable medication nonadherence risk factors. The Supplemental Nutrition Assistance Program (SNAP), a social intervention program for addressing food insecurity, may help improve adherence to antihypertensive medications. Objective: To evaluate whether receipt of SNAP benefits can modify the consequences of food insecurity on nonadherence to antihypertensive medications. Design, Setting, and Participants: A retrospective cohort study design was used to assemble a cohort of antihypertensive medication users from the linked Medical Expenditure Panel Survey (MEPS)-National Health Interview Survey (NHIS) dataset for 2016 to 2017. The MEPS is a national longitudinal survey on verified self-reported prescribed medication use and health care access measures, and the NHIS is an annual cross-sectional survey of US households that collects comprehensive health information, health behavior, and sociodemographic data, including receipt of SNAP benefits. Receipt of SNAP benefits in the past 12 months and food insecurity status in the past 30 days were assessed through standard questionnaires during the study period. Data analysis was performed from March to December 2021. Exposure: Status of SNAP benefit receipt. Main Outcomes and Measures: The main outcome, nonadherence to antihypertensive medication refill adherence (MRA), was defined using the MEPS data as the total days' supply divided by 365 days for each antihypertensive medication class. Patients were considered nonadherent if their overall MRA was less than 80%. Food insecurity status in the 30 days prior to the survey was modeled as the effect modifier. Inverse probability of treatment (IPT) weighting was used to control for measured confounding effects of baseline covariates. A probit model was used, weighted by the product of the computed IPT weights and MEPS weights, to estimate the population average treatment effects (PATEs) of SNAP benefit receipt on nonadherence. A stratified analysis approach was used to assess for potential effect modification by food insecurity status. Results: This analysis involved 6692 antihypertensive medication users, of whom 1203 (12.8%) reported receiving SNAP benefits and 1338 (14.8%) were considered as food insecure. The mean (SD) age was 63.0 (13.3) years; 3632 (51.3%) of the participants were women and 3060 (45.7%) were men. Although SNAP was not associated with nonadherence to antihypertensive medications in the overall population, it was associated with a 13.6-percentage point reduction in nonadherence (PATE, -13.6 [95% CI, -25.0 to -2.3]) among the food-insecure subgroup but not among their food-secure counterparts. Conclusions and Relevance: This analysis of a national observational dataset suggests that patients with hypertension who receive SNAP benefits may be less likely to become nonadherent to antihypertensive medication, especially if they are experiencing food insecurity. Further examination of the role of SNAP as a potential intervention for preventing nonadherence to antihypertensive medications through prospectively designed interventional studies or natural experiment study designs is needed.
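The outcome definition above (total days' supply divided by 365 per medication class, nonadherent when overall MRA is below 80%) can be written out as a short sketch. The abstract does not say how the per-class values are combined into the overall MRA, so averaging across classes is an assumption here, as are the function names and the example class labels.

```python
def medication_refill_adherence(days_supply: float, period_days: int = 365) -> float:
    """MRA for one medication class: total days' supply over the period length."""
    return days_supply / period_days

def is_nonadherent(days_supply_by_class: dict, threshold: float = 0.80) -> bool:
    """Flag a patient as nonadherent when overall MRA is below the threshold.

    Assumption: overall MRA is the mean of the per-class MRAs.
    """
    if not days_supply_by_class:
        raise ValueError("at least one medication class is required")
    mras = [medication_refill_adherence(days) for days in days_supply_by_class.values()]
    overall_mra = sum(mras) / len(mras)
    return overall_mra < threshold
```

For example, a patient with 365 days of supply for one class and 180 days for another has an assumed overall MRA of about 0.75 and would be classed as nonadherent.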

Subjects

Food Assistance , Female , Humans , Male , Middle Aged , Antihypertensive Agents/therapeutic use , Cross-Sectional Studies , Poverty , Retrospective Studies , Aged , Datasets as Topic

13.

An update of skin permeability data based on a systematic review of recent research.

Chedik, Lisa; Baybekov, Shamkhal; Cosnier, Frédéric; Marcou, Gilles; Varnek, Alexandre; Champmartin, Catherine.

Sci Data ; 11(1): 224, 2024 Feb 21.

Article in English | MEDLINE | ID: mdl-38383523

ABSTRACT

The cutaneous absorption parameters of xenobiotics are crucial for the development of drugs and cosmetics, as well as for assessing environmental and occupational chemical risks. Despite the great variability in the design of experimental conditions due to uncertain international guidelines, datasets like HuskinDB have been created to report skin absorption endpoints. This review updates available skin permeability data by rigorously compiling research published between 2012 and 2021. Inclusion and exclusion criteria have been selected to build the most harmonized and reusable dataset possible. The Generative Topographic Mapping method was applied to the present dataset and compared to HuskinDB to monitor the progress in skin permeability research and locate chemotypes of particular concern. The open-source dataset (SkinPiX) includes steady-state flux, maximum flux, lag time and permeability coefficient results for the substances tested, as well as relevant information on experimental parameters that can impact the data. It can be used to extract subsets of data for comparisons and to build predictive models.

Subjects

Skin Absorption , Skin , Xenobiotics , Permeability , Skin/metabolism , Xenobiotics/metabolism , Datasets as Topic , Humans

14.

Harmonizing government responses to the COVID-19 pandemic.

Cheng, Cindy; Messerschmidt, Luca; Bravo, Isaac; Waldbauer, Marco; Bhavikatti, Rohan; Schenk, Caress; Grujic, Vanja; Model, Tim; Kubinec, Robert; Barceló, Joan.

Sci Data ; 11(1): 204, 2024 Feb 14.

Article in English | MEDLINE | ID: mdl-38355867

ABSTRACT

Public health and safety measures (PHSM) made in response to the COVID-19 pandemic have been singular, rapid, and profuse compared to the content, speed, and volume of normal policy-making. Not only can they have a profound effect on the spread of the disease, but they may also have multitudinous secondary effects, in both the social and natural worlds. Unfortunately, despite the best efforts by numerous research groups, existing data on COVID-19 PHSM only partially captures their full geographical scale and policy scope for any significant duration of time. This paper introduces our effort to harmonize data from the eight largest such efforts for policies made before September 21, 2021 into the taxonomy developed by the CoronaNet Research Project in order to respond to the need for comprehensive, high quality COVID-19 data. In doing so, we present a comprehensive comparative analysis of existing data from different COVID-19 PHSM datasets, introduce our novel methodology for harmonizing COVID-19 PHSM data, and provide a clear-eyed assessment of the pros and cons of our efforts.

Subjects

COVID-19 , Pandemics , Policy Making , Humans , Government , Public Health , Datasets as Topic

15.

A comprehensive database of exosome molecular biomarkers and disease-gene associations.

Qi, Yue; Xu, Rongji; Song, Chengxin; Hao, Ming; Gao, Yue; Xin, Mengyu; Liu, Qian; Chen, Hongyan; Wu, Xiaoting; Sun, Rui; Zhang, Yuanfu; He, Danni; Dai, Yifan; Kong, Congcong; Ning, Shangwei; Guo, Qiuyan; Zhang, Guangmei; Wang, Peng.

Sci Data ; 11(1): 210, 2024 Feb 15.

Article in English | MEDLINE | ID: mdl-38360815

ABSTRACT

Exosomes play a crucial role in intercellular communication and can be used as biomarkers for diagnostic and therapeutic clinical applications. However, systematic studies in cancer-associated exosomal nucleic acids remain a big challenge. Here, we developed ExMdb, a comprehensive database of exosomal nucleic acid biomarkers and disease-gene associations curated from published literature and high-throughput datasets. We performed a comprehensive curation of exosome properties including 4,586 experimentally supported gene-disease associations, 13,768 diagnostic and therapeutic biomarkers, and 312,049 nucleic acid subcellular locations. To characterize expression variation of exosomal molecules and identify causal factors of complex diseases, we have also collected 164 high-throughput datasets, including bulk and single-cell RNA sequencing (scRNA-seq) data. Based on these datasets, we performed various bioinformatics and statistical analyses to support our conclusions and advance our knowledge of exosome biology. Collectively, our dataset will serve as an essential resource for investigating the regulatory mechanisms of complex diseases and improving the development of diagnostic and therapeutic biomarkers.

Subjects

Datasets as Topic , Exosomes , Neoplasms , Nucleic Acids , Humans , Biomarkers , Tumor Biomarkers , Computational Biology , Exosomes/genetics , Neoplasms/diagnosis , Neoplasms/genetics , Nucleic Acids/genetics , Genetic Databases

16.

Deep Learning and Machine Learning Algorithms for Retinal Image Analysis in Neurodegenerative Disease: Systematic Review of Datasets and Models.

Bahr, Tyler; Vu, Truong A; Tuttle, Jared J; Iezzi, Raymond.

Transl Vis Sci Technol ; 13(2): 16, 2024 Feb 01.

Article in English | MEDLINE | ID: mdl-38381447

ABSTRACT

Purpose: Retinal images contain rich biomarker information for neurodegenerative disease. Recently, deep learning models have been used for automated neurodegenerative disease diagnosis and risk prediction using retinal images with good results. Methods: In this review, we systematically report studies with datasets of retinal images from patients with neurodegenerative diseases, including Alzheimer's disease, Huntington's disease, Parkinson's disease, amyotrophic lateral sclerosis, and others. We also review and characterize the models in the current literature which have been used for classification, regression, or segmentation problems using retinal images in patients with neurodegenerative diseases. Results: Our review found several existing datasets and models with various imaging modalities primarily in patients with Alzheimer's disease, with most datasets on the order of tens to a few hundred images. We found limited data available for the other neurodegenerative diseases. Although cross-sectional imaging data for Alzheimer's disease is becoming more abundant, datasets with longitudinal imaging of any disease are lacking. Conclusions: The use of bilateral and multimodal imaging together with metadata seems to improve model performance, thus multimodal bilateral image datasets with patient metadata are needed. We identified several deep learning tools that have been useful in this context including feature extraction algorithms specifically for retinal images, retinal image preprocessing techniques, transfer learning, feature fusion, and attention mapping. Importantly, we also consider the limitations common to these models in real-world clinical applications. Translational Relevance: This systematic review evaluates the deep learning models and retinal features relevant in the evaluation of retinal images of patients with neurodegenerative disease.

Subjects

Alzheimer Disease , Deep Learning , Neurodegenerative Diseases , Retina , Humans , Algorithms , Alzheimer Disease/diagnostic imaging , Machine Learning , Neurodegenerative Diseases/diagnostic imaging , Datasets as Topic , Retina/diagnostic imaging

17.

Analysis of large data sets now available in certain sports (and more).

Meyer, Tim.

J Sci Med Sport ; 27(2): 71, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38316498

Subjects

Sports, Humans, Datasets as Topic

18.

Accurate diagnosis of COVID-19 from lung CT images using transfer learning.

Tas, H G; Tas, M B H; Irgul, B; Aydin, S; Kuyrukluyildiz, U.

Eur Rev Med Pharmacol Sci ; 28(3): 1213-1226, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38375726

ABSTRACT

OBJECTIVE: This study aims to classify tomographic images for the diagnosis of COVID-19 by extracting features with image processing and transfer learning. MATERIALS AND METHODS: CT images were first made easier for the models to analyse through preprocessing steps such as masking and segmentation. The number of images was then increased by data augmentation. The resulting dataset is large, which makes the model results more reliable. The dataset was split into 70% training and 30% testing. In this way, different features of the applied models were found, and positive effects were achieved on the result. Transfer learning was used to reduce training times and further increase the success rate. To find the best method, many different pre-trained transfer learning models were tried and compared with many different studies. RESULTS: A total of 8,354 images were used. Of these, 2,695 were from COVID-19 patients and the remainder were healthy chest tomography images. All images were passed to the models after masking and segmentation. In the experimental evaluation, ResNet-50 was the best model and gave the highest results (accuracy 95.7%, precision 94.7%, recall 99.2%, specificity 88.3%, F1 score 96.9%, ROC-AUC score 97%). CONCLUSIONS: The presence of a COVID-19 lesion in the images was identified with high accuracy and recall using the transfer learning model we developed on thorax CT images. This outcome demonstrates that the strategy will speed up the diagnosis of COVID-19.

Subjects

Artificial Intelligence, COVID-19, Humans, COVID-19/diagnostic imaging, COVID-19 Testing, Lung/diagnostic imaging, Machine Learning, X-Ray Computed Tomography, Datasets as Topic

19.

Brain tumor segmentation using synthetic MR images - A comparison of GANs and diffusion models.

Usman Akbar, Muhammad; Larsson, Måns; Blystad, Ida; Eklund, Anders.

Sci Data ; 11(1): 259, 2024 Feb 29.

Article in English | MEDLINE | ID: mdl-38424097

ABSTRACT

Large annotated datasets are required for training deep learning models, but in medical imaging data sharing is often complicated due to ethics, anonymization and data protection legislation. Generative AI models, such as generative adversarial networks (GANs) and diffusion models, can today produce very realistic synthetic images, and can potentially facilitate data sharing. However, in order to share synthetic medical images it must first be demonstrated that they can be used for training different networks with acceptable performance. Here, we therefore comprehensively evaluate four GANs (progressive GAN, StyleGAN 1-3) and a diffusion model for the task of brain tumor segmentation (using two segmentation networks, U-Net and a Swin transformer). Our results show that segmentation networks trained on synthetic images reach Dice scores that are 80%-90% of Dice scores when training with real images, but that memorization of the training images can be a problem for diffusion models if the original dataset is too small. Our conclusion is that sharing synthetic medical images is a viable option to sharing real images, but that further work is required. The trained generative models and the generated synthetic images are shared on AIDA data hub.

Subjects

Brain Neoplasms, Humans, Brain Neoplasms/diagnostic imaging, Computer-Assisted Image Processing, Information Dissemination, Datasets as Topic

20.

Fertilizer management for global ammonia emission reduction.

Xu, Peng; Li, Geng; Zheng, Yi; Fung, Jimmy C H; Chen, Anping; Zeng, Zhenzhong; Shen, Huizhong; Hu, Min; Mao, Jiafu; Zheng, Yan; Cui, Xiaoqing; Guo, Zhilin; Chen, Yilin; Feng, Lian; He, Shaokun; Zhang, Xuguo; Lau, Alexis K H; Tao, Shu; Houlton, Benjamin Z.

Nature ; 626(8000): 792-798, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38297125

ABSTRACT

Crop production is a large source of atmospheric ammonia (NH₃), which poses risks to air quality, human health and ecosystems [1-5]. However, estimating global NH₃ emissions from croplands is subject to uncertainties because of data limitations, thereby limiting the accurate identification of mitigation options and efficacy [4,5]. Here we develop a machine learning model for generating crop-specific and spatially explicit NH₃ emission factors globally (5-arcmin resolution) based on a compiled dataset of field observations. We show that global NH₃ emissions from rice, wheat and maize fields in 2018 were 4.3 ± 1.0 Tg N yr⁻¹, lower than previous estimates that did not fully consider fertilizer management practices [6-9]. Furthermore, spatially optimizing fertilizer management, as guided by the machine learning model, has the potential to reduce the NH₃ emissions by about 38% (1.6 ± 0.4 Tg N yr⁻¹) without altering total fertilizer nitrogen inputs. Specifically, we estimate potential NH₃ emissions reductions of 47% (44-56%) for rice, 27% (24-28%) for maize and 26% (20-28%) for wheat cultivation, respectively. Under future climate change scenarios, we estimate that NH₃ emissions could increase by 4.0 ± 2.7% under SSP1-2.6 and 5.5 ± 5.7% under SSP5-8.5 by 2030-2060. However, targeted fertilizer management has the potential to mitigate these increases.

Subjects

Ammonia, Crop Production, Fertilizers, Ammonia/analysis, Ammonia/metabolism, Crop Production/methods, Crop Production/statistics & numerical data, Crop Production/trends, Datasets as Topic, Ecosystem, Fertilizers/adverse effects, Fertilizers/analysis, Fertilizers/statistics & numerical data, Machine Learning, Nitrogen/analysis, Nitrogen/metabolism, Oryza/metabolism, Soil/chemistry, Triticum/metabolism, Zea mays/metabolism, Climate Change/statistics & numerical data
